Abstract 

Zheng and Tse have shown that over a quasi-static channel, there exists a fundamental tradeoff, 
known as the diversity-multiplexing gain (D-MG) tradeoff. In a realistic system, to avoid inefficiently 
operating the power amplifier, one should consider the situation where constraints are imposed on the 
peak to average power ratio (PAPR) of the transmitted signal. In this paper, the D-MG tradeoff of 
multi-antenna systems with PAPR constraints is analyzed. For Rayleigh fading channels, we show that 
the D-MG tradeoff remains unchanged with any PAPR constraints larger than one. This result implies 
that, instead of designing codes on a case-by-case basis, as done by most existing works, there possibly 
exist general methodologies for designing space-time codes with low PAPR that achieve the optimal 
D-MG tradeoff. As an example of such methodologies, we propose a PAPR reduction method based on 
constellation shaping that can be applied to existing optimal space-time codes without affecting their 
optimality in the D-MG tradeoff. Unlike most PAPR reduction methods, the proposed method does not 
introduce redundancy or require side information to be transmitted to the decoder. Two realizations of 
the proposed method are considered. The first is similar to the method proposed by Kwok except that 
we employ the Hermite Normal Form (HNF) decomposition instead of the Smith Normal Form (SNF) to 
reduce complexity. The second uses integer reversible mapping, which avoids the difficulty 
of matrix decomposition when the number of antennas becomes large. Sphere decoding is performed to 
verify that the proposed PAPR reduction method does not affect the performance of optimal space-time 
codes. 

The material in this paper was presented in part at the Annual Conference on Information Sciences and Systems (CISS), 
Princeton, New Jersey, Mar. 2008, and the IEEE International Symposium on Personal, Indoor and Mobile Radio Communications 
(PIMRC), Cannes, France, Sept. 2008. 

EDICS: MSP-STCD 
I. Introduction 

The results in [1] on the diversity-multiplexing gain (D-MG) tradeoff spurred numerous research 
activities towards the construction of space-time codes achieving the optimal tradeoff [2]-[8]. When 
examining these space-time codes, we find that these codes generally lead to high peak-to-average power 
ratio (PAPR) on each antenna. In practice, PAPR of the signals transmitted is an important parameter to 
be considered during hardware design. A high PAPR poses difficulties in the design of the amplifier and 
raises the cost of the transmitter. These practical issues motivate our study on the D-MG tradeoff of multi- 
antenna systems with PAPR constraints. For Rayleigh fading channels, our analytical result shows that the 
D-MG tradeoff remains the same with any PAPR constraints larger than one. To the best of our knowledge, 
this is the first analytical result in the literature on the D-MG tradeoff of multi-antenna systems with PAPR 
constraints. This result implies that, instead of designing codes on a case-by-case basis, as done by most 
existing works (e.g., [9]), there possibly exist general methodologies for designing space-time codes with 
low PAPR that achieve the optimal D-MG tradeoff. As an example of such methodologies, we propose a 
PAPR reduction method based on constellation shaping that can be applied to existing optimal space-time 
codes without affecting their optimality in the D-MG tradeoff. Unlike most PAPR reduction methods, 
the proposed method does not introduce redundancy or require side information to be transmitted to the 
decoder. In general, constellation shaping can be tailored to serve different purposes (e.g., minimizing the 
average transmission power), which often result in different shaping regions. The proposed method aims to 
reduce the PAPR without affecting the rate or the optimality in the D-MG tradeoff of the 
original code. For easier implementation and illustration, the target shaping region is a hypercube, which 
leads to an asymptotic PAPR of 3 when the constellation size is large. Lower PAPR might be possible 
with a different shaping region, which, however, might be difficult to implement. A similar approach was 
proposed in [10, Chapter 5] for PAPR reduction of orthogonal frequency division multiplexing (OFDM) 
systems with constellation shaping based on the Smith Normal Form (SNF) [11] decomposition of integer 
matrices. Due to the prohibitive computational complexity of the SNF decomposition when the number 
of OFDM carriers is large, the author of [10] also considered discrete Hadamard transform (DHT) based 
multi-channel systems, which admit a low-complexity SNF decomposition. The authors of [12] then 
took the constellation shaping algorithm derived for DHT-based systems and applied it to OFDM systems, 
in conjunction with a selective mapping (SLM) method, which incurs redundant bits, to overcome the 
residual PAPR problem due to the mismatch in constellation shaping. 

Two realizations of the proposed method will be discussed. The first is similar to the one in [10], 
except that we employ the Hermite Normal Form (HNF) decomposition [13], [14] instead of the SNF 
decomposition to reduce the computational complexity. The second uses integer reversible 
mapping [15], [16], which avoids the bit assignment problem in the above methods and the difficulty 
of integer matrix decomposition when the size of the matrix becomes large. Therefore, this approach is 
more suitable for situations where the number of transmit antennas or the number of OFDM carriers 
is large. Aside from these advantages over the methods in [10] and [12], it is also worth mentioning that 
our work is better justified because the integer-based constellation shaping is crucial in preserving the 
optimality of space-time codes, while for the uncoded OFDM application considered in [10] and [12], the 
integer-based constellation shaping is not necessary, and its advantage over non-integer-based shaping 
schemes (for example, the single carrier frequency division multiple access (SC-FDMA) [17] scheme, which is 
equivalent to using a discrete Fourier transform (DFT) to shape the constellation) remains to be investigated. 
Note that the concept and derivation of the proposed method are very general, thus they can be applied to 
any linear transform based multi-channel modulation. For the space-time codes considered in this paper, 
simulation results using sphere decoding verify that the proposed PAPR reduction method does not affect 
the optimality of the codes. 

The rest of this paper is organized as follows. Section II introduces the system model and the definitions 
of diversity and multiplexing gains. In Section III we analyze the D-MG tradeoff with any PAPR 
constraint larger than one, and show that, in Rayleigh fading channels, the D-MG tradeoff remains 
unchanged. In Section IV a unified framework of approximate cubic shaping is described. In Section V 
and Section VI we propose two approaches to PAPR reduction via approximate cubic shaping. The first 
selects the transmitted signal using the HNF decomposition, while the second uses integer 
reversible mapping. Section VII provides some simulation results and discussions. Finally, conclusions 
are drawn in Section VIII. 

II. System Model and Definitions 

A. System Model 

As in [1], consider a wireless link with $m$ transmit and $n$ receive antennas. The fading coefficient $h_{ij}$ 
is the path gain from transmit antenna $j$ to receive antenna $i$. Let the channel matrix $\mathbf{H} = [h_{ij}] \in \mathbb{C}^{n \times m}$. 
We assume that the fading coefficients are independent complex Gaussian with zero mean and unit variance, 
and are known to the receiver but not to the transmitter. We also assume that the channel matrix $\mathbf{H}$ remains 
constant within a block of $l$ symbols. That is, the block length is much smaller than the coherence time 
of the channel. Then the channel, within one block, can be written as 

$$\mathbf{Y} = \sqrt{\frac{\mathrm{SNR}}{m}}\,\mathbf{H}\mathbf{X} + \mathbf{W} \tag{1}$$

where $\mathbf{X} \in \mathbb{C}^{m \times l}$ has entries $x_{ij}$, $i = 1,\dots,m$, $j = 1,\dots,l$, being the signals transmitted by antenna $i$ at time 
$j$ such that the average transmission power on each antenna in each symbol duration is 1; $\mathbf{Y} \in \mathbb{C}^{n \times l}$ is 
the received signal; $\mathbf{W}$ is the additive noise with independent and identically distributed (i.i.d.) entries 
$w_{ij} \sim \mathcal{CN}(0, 1)$ (i.e., complex Gaussian with mean 0 and variance 1); SNR is the average signal-to-noise 
ratio at each receive antenna. A codebook $\mathcal{C}$ with rate $R$ bits per second per hertz (b/s/Hz) is 
used, which has $|\mathcal{C}| = 2^{Rl}$ codewords, each being an $m \times l$ matrix. 
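
As an illustration of the block model (1), the following minimal sketch draws one channel realization and one received block. It assumes NumPy; the function name `simulate_block` and the use of unit-power QPSK entries for $\mathbf{X}$ are our own placeholders and do not correspond to any D-MG optimal code.

```python
import numpy as np

def simulate_block(m, n, l, snr, rng):
    """Draw one block of Y = sqrt(snr/m) * H X + W with i.i.d. CN(0,1) entries in H and W."""
    def cn(shape):
        # circularly symmetric complex Gaussian, zero mean, unit variance
        return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

    H = cn((n, m))
    # placeholder codeword (assumption): unit-power QPSK on every antenna and symbol time
    X = (rng.choice([-1, 1], (m, l)) + 1j * rng.choice([-1, 1], (m, l))) / np.sqrt(2)
    W = cn((n, l))
    Y = np.sqrt(snr / m) * H @ X + W
    return H, X, Y

rng = np.random.default_rng(0)
H, X, Y = simulate_block(m=2, n=3, l=4, snr=100.0, rng=rng)
```

With this normalization each entry of $\mathbf{X}$ has unit power, so `snr` plays the role of the average SNR at each receive antenna.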

B. Diversity and Multiplexing Gains 

For the case without PAPR constraints on each antenna, in order to achieve a certain fraction of the 
capacity at high SNR, one should consider a family of codes that support a data rate which increases 
with $\log(\mathrm{SNR})$. The diversity and multiplexing gains are defined as in [1]. 

Definition 1: A diversity gain $d^*(r)$ is achieved at multiplexing gain $r$ if the data rate $R(\mathrm{SNR})$ satisfies 

$$\lim_{\mathrm{SNR}\to\infty} \frac{R(\mathrm{SNR})}{\log \mathrm{SNR}} = r \tag{2}$$

and the outage probability $P_{out}(R)$ satisfies 

$$\lim_{\mathrm{SNR}\to\infty} \frac{\log P_{out}(R)}{\log \mathrm{SNR}} = -d^*(r). \tag{3}$$

□ 

The function $d^*(r)$ characterizes the D-MG tradeoff. For convenience, we borrow the notation introduced 
in [1] to denote exponential equality. That is, $f(\mathrm{SNR}) \doteq \mathrm{SNR}^{b}$ means 

$$\lim_{\mathrm{SNR}\to\infty} \frac{\log f(\mathrm{SNR})}{\log \mathrm{SNR}} = b.$$

The relations $\dot{\geq}$ and $\dot{\leq}$ are similarly defined. 

III. Diversity-Multiplexing Gain Tradeoff with PAPR constraints 

When space-time codes are used in a multi-antenna system, due to the coding procedure which 
combines the information symbols to form the coded symbols for each transmit antenna, high PAPR 
values may occur, especially when the number of transmit antennas is large. To reflect the limitations of 
practical communication systems, we take PAPR into consideration and investigate the effect of PAPR 
constraints on the D-MG tradeoff. 

A. The Behavior of Capacity at High SNR with PAPR Constraints 

For the study on the optimal D-MG tradeoff with PAPR constraints, characterization of the multiplexing 
gain is needed. That is, we need to know how the capacity grows with SNR. However, the expression 
of the exact capacity of a multi-antenna channel with inputs subject to average total power and PAPR 
constraints may not have a closed form, or may be too complicated (for the single antenna scenario with 
average power and peak power constraints, see [18], [19]). Fortunately, since the D-MG tradeoff is 
an asymptotic tradeoff, what we need is simply the behavior of the capacity for asymptotically large 
SNR. In this section, we will derive a lower bound on the capacity with average total power and PAPR 
constraints. The bound is tight enough for the derivation of the D-MG tradeoff. The capacity without 
PAPR constraints (already known in [20], [21]) can be used as an upper bound. These two bounds are 
then used to characterize the capacity for large SNR. 

Since the channel remains constant within a block, the capacity achieving signal and average power 
distribution should not favor one symbol duration over another within the same block. Thus, for the 
purpose of analyzing the capacity with respect to the average SNR, it suffices to focus on any symbol 
duration within a block. We take the signal and noise vectors in (1) pertaining to the same symbol 
duration, and drop the time index to form a new vector channel model 

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w} \tag{4}$$

where $\mathbf{x} \in \mathbb{C}^{m}$ is the transmitted signal vector scaled by the transmission power, $\mathbf{y} \in \mathbb{C}^{n}$ is the received 
signal vector, and the additive noise vector $\mathbf{w}$ has i.i.d. entries $w_i \sim \mathcal{CN}(0, 1)$. The average total power 
and PAPR constraints of the transmitted signal $\mathbf{x}$ are $P > 0$ and $\rho_i > 1$, $i = 1,\dots,m$, respectively, such that 

$$\mathrm{Tr}\left(E_{\mathbf{x}}\left[\mathbf{x}\mathbf{x}^{\dagger}\right]\right) \leq P, \tag{5}$$

$$\frac{\max |x_i|^2}{E\left[|x_i|^2\right]} \leq \rho_i, \quad i = 1,\dots,m, \tag{6}$$

where $\mathrm{Tr}(\cdot)$ denotes trace and $\mathbf{x}^{\dagger}$ denotes the conjugate transpose of $\mathbf{x}$, $E_t[\cdot]$ denotes the expectation with 
respect to the distribution of $t$, and $x_i$ is the $i$-th element of $\mathbf{x}$. With these definitions, we have the following 
lower bound on the capacity of this channel. 

Lemma 1: The ergodic capacity $C$ of the channel (4) with the transmitted signal subject to (5), (6) satisfies 

$$C \geq E_{\mathbf{H}}\left[\log\det\left(\mathbf{I} + \frac{P}{m}\mathbf{H}\mathbf{H}^{\dagger}\right)\right] + \sum_{i=1}^{m} k_i \tag{7}$$

where $k_i$ are constants defined in Appendix A. 
Proof: The proof is given in Appendix A. 

B. Optimal D-MG Tradeoff with PAPR Constraints 

Now we are ready to discuss the D-MG tradeoff with PAPR constraints. We have 
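Lemma 2: For the Rayleigh fading channel, the ergodic capacity $C$ of the channel (4) with transmitted 
signal subject to (5), (6) satisfies 

$$C \doteq \min(m,n)\log(\mathrm{SNR}). \tag{8}$$

Proof: Let $C_\infty$ be the capacity without PAPR constraints. It is well known that [20], [21] 

$$C_\infty = E_{\mathbf{H}}\left[\log\det\left(\mathbf{I} + \frac{P}{m}\mathbf{H}\mathbf{H}^{\dagger}\right)\right] \doteq \min(m,n)\log(\mathrm{SNR}).$$

Using $C_\infty$ as an upper bound, from (7), we have 

$$C_\infty + \sum_{i=1}^{m} k_i \leq C \leq C_\infty,$$

and clearly, 

$$C_\infty + \sum_{i=1}^{m} k_i \doteq \min(m,n)\log(\mathrm{SNR}).$$

Thus, 

$$C \doteq \min(m,n)\log(\mathrm{SNR}).$$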



Lemma 2 shows that the multiplexing gain $r$ remains the same even with PAPR constraints. The main 
result is given in the following theorem. 

Theorem 1: For the Rayleigh fading channel, the optimal D-MG tradeoff with any PAPR constraint 
$\rho > 1$ is the same as in the case without PAPR constraints. 
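Proof: The outage probability is 

$$\begin{aligned}
P_{out}(R) &= \min_{f_{\mathbf{x}}(\mathbf{x})} P\left[I(\mathbf{x};\mathbf{y}\,|\,\mathbf{H}) < R\right] \\
&\leq P\left[\log\det\left(\mathbf{I} + \frac{P}{m}\mathbf{H}\mathbf{H}^{\dagger}\right) + \sum_{i=1}^{m} k_i < R\right] \\
&\doteq P\left[\log\det\left(\mathbf{I} + \mathrm{SNR}\,\mathbf{H}\mathbf{H}^{\dagger}\right) + \sum_{i=1}^{m} k_i < R\right]
\end{aligned} \tag{9}$$

where $I(\cdot\,;\cdot)$ denotes the mutual information and $f_{\mathbf{x}}(\mathbf{x})$ is the probability density function of $\mathbf{x}$ subject to 
equations (5) and (6). The inequality follows from (63) and $\doteq$ follows from equation (9) in [1]. Using 
the same techniques as in [1], denoting $\lambda_i$ as the nonzero eigenvalues of $\mathbf{H}\mathbf{H}^{\dagger}$ and letting $R = r\log\mathrm{SNR}$, 
$\sum_{i=1}^{m} k_i = K$, $\lambda_i = \mathrm{SNR}^{-\alpha_i}$, $(x)^+ = \max(x,0)$, we have 

$$\begin{aligned}
P\left[\log\det\left(\mathrm{SNR}\,\mathbf{H}\mathbf{H}^{\dagger} + \mathbf{I}\right) + K < r\log\mathrm{SNR}\right]
&= P\left[\log\prod_{i=1}^{\min(m,n)} \left(1 + \mathrm{SNR}\,\lambda_i\right) + K < r\log\mathrm{SNR}\right] \\
&\doteq P\left[\sum_{i=1}^{\min(m,n)} (1-\alpha_i)^+ < r\right].
\end{aligned} \tag{10}$$

Thus $P_{out}(R) \,\dot{\leq}\,$ (10), and (10) is exponentially equal to the outage probability without PAPR constraints in 
[1]. However, the outage probability with PAPR constraints cannot be smaller than the outage probability 
without PAPR constraints, that is, $P_{out}(R) \,\dot{\geq}\,$ (10). Thus $P_{out}(R) \doteq$ (10), and the optimal tradeoff remains 
the same as in the case without PAPR constraints. ■ 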
Intuitively, this result is not surprising, since the PAPR constraints do not reduce the spatial degree of 
freedom and the capacity C grows like Coo with increasing SNR. 

To show that this optimal tradeoff can be achieved by a code with finite code length, we adopt a 
method similar to that in [1] by choosing the input to be a random code drawn from the i.i.d. distribution (75). 

Theorem 2: For $l \geq m + n - 1$, in Rayleigh fading channels with any PAPR constraint $\rho > 1$, the optimal 
D-MG tradeoff is achievable. 

Proof: The proof is given in Appendix O ■ 
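
For reference, the optimal tradeoff curve of [1], which Theorems 1 and 2 show to be unaffected by the PAPR constraint, is the piecewise-linear function joining the points $(k, (m-k)(n-k))$, $k = 0,\dots,\min(m,n)$. The sketch below evaluates it; the function name `dmg_tradeoff` is ours (hypothetical), and the formula is taken from [1] rather than restated in this paper.

```python
def dmg_tradeoff(r, m, n):
    """Piecewise-linear d*(r) joining the points (k, (m-k)(n-k)), k = 0..min(m, n)."""
    kmax = min(m, n)
    if not 0 <= r <= kmax:
        raise ValueError("multiplexing gain must lie in [0, min(m, n)]")
    k = min(int(r), kmax - 1)              # left endpoint of the segment containing r
    d_left = (m - k) * (n - k)
    d_right = (m - k - 1) * (n - k - 1)
    return d_left + (r - k) * (d_right - d_left)
```

For example, with $m = n = 2$ the curve passes through $(0,4)$, $(1,1)$ and $(2,0)$.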

IV. Approximate Cubic Shaping 

In this section, we discuss the fundamental concepts of the shaping techniques we use to reduce the 
PAPR of existing D-MG optimal space-time codes. A constellation generally consists of a set of points 
on an $l$-dimensional complex lattice, or an $L$-dimensional real lattice (where $L = 2l$), that are enclosed 
within a finite region. The boundary of a signal constellation affects the average power and PAPR 
for a given transmitted data rate. In selecting the signal constellation, one tries to minimize the average 
power while keeping the PAPR low. The $L$-dimensional constellation consisting of all the points enclosed within an 
$L$-dimensional cube is called cubic shaping, which leads to a PAPR value equal to 3 when the constellation 
size approaches infinity. With the same number of points to be transmitted, the reduction in the average 
transmission power due to the use of a given region as signal constellation instead of a hypercube is referred 
to as the shaping gain of that region. The region that has the smallest average power for a given volume is 
an $L$-dimensional sphere. Although sphere shaping gives the best shaping gain, it also results in high 
PAPR values when $L$ is large. Shaping of multidimensional constellations has been extensively studied 
previously [22]-[25]. For our interest in PAPR reduction, we will focus on cubic shaping due to its 
good PAPR value, asymptotically equal to 3. 
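
The asymptotic value of 3 can be checked numerically. The sketch below assumes equally likely, zero-mean, odd-integer levels in each real dimension (a hypercube centered at the origin); the helper name `pam_papr` is ours.

```python
import numpy as np

def pam_papr(points_per_dim):
    """PAPR of a zero-mean, equally likely alphabet with odd-integer levels in one real dimension."""
    levels = np.arange(-(points_per_dim - 1), points_per_dim, 2)  # ..., -3, -1, 1, 3, ...
    power = levels.astype(float) ** 2
    return power.max() / power.mean()

for n_levels in (2, 4, 16, 256):
    print(n_levels, pam_papr(n_levels))
```

For $N$ levels per dimension the ratio is $3(N-1)/(N+1)$, which tends to 3 as $N$ grows. The same ratio holds for a complex symbol formed from two such real coordinates, since both the peak and the average power double.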

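Consider the shaping on a general space-time code $\mathbf{X}$ in the form of 

$$\mathbf{x} = \mathbf{G}\mathbf{s}, \quad \mathbf{s} \in \mathbb{Z}^{M}, \quad \mathbf{G} \in \mathbb{R}^{M \times M} \tag{11}$$

where $\mathbf{x}$ is an isomorphic vector representation of $\mathbf{X}$, $\mathbf{G}$ is an invertible generator matrix and $\mathbf{s}$ is the 
vector of information symbols chosen from the $M$-dimensional integer lattice $\mathbb{Z}^{M}$. A Quadrature Amplitude 
Modulation (QAM) constellation is a subset of a scaled integer lattice $\mathbb{Z}^{M}$. 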

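For example, an $m \times m$ D-MG optimal space-time code $\mathbf{X}$ proposed in [7] can be expressed in terms 
of $m$ vectors $\mathbf{x}^{(i)}$, $i = 1,2,\dots,m$, 

$$\mathbf{x}^{(i)} = \mathbf{G}^{(i)}\mathbf{s}^{(i)}, \quad \mathbf{s}^{(i)} \in (\mathbb{Z}[\mathrm{i}])^{m},$$

where $\mathbb{Z}[\mathrm{i}]$ stands for the Gaussian integers (i.e., $a + b\mathrm{i}$, $a,b \in \mathbb{Z}$) and each $\mathbf{x}^{(i)}$ corresponds to the symbols 
in the space-time codeword matrix with positions corresponding to the nonzero elements' positions of 
$\mathbf{B}^{i-1}$, where 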

^0 ••• 1^ 
1 ••• 
1 ••• 



B 



\0 ••• 1 uy 

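Note that although these symbols are on different time slots and different antennas, they have equal average 
power with respect to all codewords owing to the code structure [7]. For each $\mathbf{s}^{(i)}$, $\mathbf{G}^{(i)}$, we can get the 
isomorphic representation by separating the real and imaginary parts of $\mathbf{s}^{(i)}$ and $\mathbf{G}^{(i)}$ as follows: 

$$\mathbf{s}'^{(i)} = \begin{bmatrix} \mathbf{s}^{(i)}_{\mathrm{Re}} \\ \mathbf{s}^{(i)}_{\mathrm{Im}} \end{bmatrix}
\quad \text{and} \quad
\mathbf{G}'^{(i)} = \begin{bmatrix} \mathbf{G}^{(i)}_{\mathrm{Re}} & -\mathbf{G}^{(i)}_{\mathrm{Im}} \\ \mathbf{G}^{(i)}_{\mathrm{Im}} & \mathbf{G}^{(i)}_{\mathrm{Re}} \end{bmatrix}.$$

Then $\mathbf{x}'^{(i)} = \mathbf{G}'^{(i)}\mathbf{s}'^{(i)}$, where $\mathbf{s}'^{(i)} \in \mathbb{Z}^{2m}$, $\mathbf{G}'^{(i)} \in \mathbb{R}^{2m \times 2m}$, which has exactly the same form as (11). 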

Our goal is to shape the transmitted signals such that the constellation region of $\mathbf{x}$ is cubic. However, 
as the constellation points of $\mathbf{x}$ have to be on the lattice that achieves the optimal D-MG tradeoff, the 
constellation region will not be exactly cubic. The idea is to shape by cosets. Since the lattice of the information 
symbol vector $\mathbf{s}$ is more regular (an integer lattice) and easier to label, cosets will be found in the domain of $\mathbf{s}$ 
using a basis corresponding (approximately) to the cubic basis for $\mathbf{x}$. The following two steps illustrate 
how this can be done. 

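Step 1: Introduce a set of perturbation vectors $U$ such that each $\mathbf{u} \in U$ is a linear combination of 
vectors $\mathbf{v}^i$, $i = 1,\dots,M$, 

$$\mathbf{u} = \alpha_1\mathbf{v}^1 + \alpha_2\mathbf{v}^2 + \dots + \alpha_M\mathbf{v}^M \tag{12}$$

where $\boldsymbol{\alpha} = [\alpha_1,\alpha_2,\dots,\alpha_M]^T \in \mathbb{Z}^{M}$ and $\mathbf{v}^i$ has the properties that 

$$\mathbf{G}\mathbf{v}^i = \tilde{\mathbf{v}}^i, \quad i = 1,2,\dots,M, \qquad
\tilde{\mathbf{v}}^i = [\varepsilon_1,\varepsilon_2,\dots,\sigma_i,\dots,\varepsilon_M]^T, \quad |\sigma_i| \gg |\varepsilon_j|, \tag{13}$$

and 

$$|\sigma_1| \approx |\sigma_2| \approx \dots \approx |\sigma_M|. \tag{14}$$

Let $(\mathbf{s} + \mathbf{u})$, $\forall \mathbf{u} \in U$, be the coset representing the same information. 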

Step 2: Choose $(\mathbf{s} + \mathbf{u}^*)$ as the vector of information symbols such that the transmitted signals $\mathbf{x} = 
\mathbf{G}(\mathbf{s} + \mathbf{u}^*)$ form an approximate cubic constellation. The possible transmitted signals can be written 
as 

$$\mathbf{G}(\mathbf{s} + \mathbf{u}) = \mathbf{G}\mathbf{s} + \mathbf{G}\mathbf{u} = \tilde{\mathbf{s}} + \tilde{\mathbf{u}}
= \tilde{\mathbf{s}} + (\alpha_1\tilde{\mathbf{v}}^1 + \alpha_2\tilde{\mathbf{v}}^2 + \dots + \alpha_M\tilde{\mathbf{v}}^M) \tag{15}$$

where $\tilde{\mathbf{s}} = \mathbf{G}\mathbf{s}$, $\tilde{\mathbf{u}} = \mathbf{G}\mathbf{u}$. In this particular set $U$, each $\mathbf{u}$ causes relatively large perturbations on those 
elements of $\tilde{\mathbf{s}}$ for which the corresponding $\alpha_i \neq 0$. If we treat the $\varepsilon_j$'s as 0, then $\mathbf{u}^*$ 
can be found by modulo operations so as to put $\mathbf{x}$ in the cubic constellation. However, the mapping from $\mathbf{s}$ to $\mathbf{x}$ has to be reversible 
for successful decoding. In other words, these approximations need to be reversible. In the following two 
sections, we will propose two such mappings for approximate cubic shaping. An approximate hypercube 
constellation leads to a low PAPR value close to 3 when the $\varepsilon_j$'s are relatively small, i.e., when the constellation 
is large enough. 

V. APPROXIMATE CUBIC SHAPING VIA HERMITE NORMAL FORM (HNF) DECOMPOSITION 

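Firstly, we need to decide $\mathbf{v}^i$, $i = 1,2,\dots,M$, in (12). Consider a partition $\mathbb{Z}^{M}/\Lambda$, where the lattice 
$\Lambda = \mathbf{Q}\mathbb{Z}^{M}$, and $\mathbf{Q}$ is an $M \times M$ integer matrix such that 

$$\mathbf{G}\mathbf{Q} \approx \sigma\mathbf{I}.$$

The approximation is due to the fact that $\mathbf{Q}$ is an integer matrix. If we choose $\mathbf{v}^i$ as 

$$\mathbf{v}^i = \mathbf{Q}\mathbf{e}^i, \qquad \mathbf{e}^i = [0_{(1)},\dots,0_{(i-1)},1_{(i)},0_{(i+1)},\dots,0_{(M)}]^T, \tag{16}$$

clearly, $\mathbf{v}^i$ has the properties (13) and (14) when $\sigma$ is reasonably large. Thus, 

$$\tilde{\mathbf{v}}^i \triangleq \mathbf{G}\mathbf{v}^i = \mathbf{G}\mathbf{Q}\mathbf{e}^i
= [\varepsilon_1,\varepsilon_2,\dots,\sigma_i,\dots,\varepsilon_M]^T \approx \sigma\mathbf{e}^i, \quad |\sigma_i| \gg |\varepsilon_j|. \tag{17}$$

Define $U = \mathbf{Q}\mathbb{Z}^{M}$. We can rewrite (15) in terms of the coset $\mathbf{s} + U$: 

$$\mathbf{G}(\mathbf{s} + U) = \mathbf{G}\mathbf{s} + \mathbf{G}\mathbf{Q}\mathbb{Z}^{M}
= \tilde{\mathbf{s}} + \mathbf{G}\mathbf{Q}\mathbb{Z}^{M}
\approx \tilde{\mathbf{s}} + \sigma\mathbb{Z}^{M}. \tag{18}$$

Approximate cubic shaping can be done by treating the $\varepsilon_j$'s as 0 (equivalently, treating the approximation in (18) as 
equality), then searching for $\mathbf{u}^* \in \mathbf{Q}\mathbb{Z}^{M}$ to put $\mathbf{x}$ in the approximate cubic constellation. 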

A geometric interpretation of this shaping method is that we choose s + u* in a shaped constellation 
whose boundary is a parallelotope defined along the columns of Q. Thus the signal boundary in the 
domain of $\mathbf{x}$ translates to an approximate hypercube. In the following, we will describe the shaping 
process in three parts: (1) determine $\mathbf{Q}$, (2) find the coset leaders, and (3) put $\mathbf{x}$ in an approximate hypercube. 
These are derived from a different point of view than the similar works in [12] and [10]. Note that 
[12] and [10] deal with the PAPR of single-antenna OFDM systems, not with D-MG optimal space-time 
coded systems whose transmitted signals need to be on certain lattices. 

(1) Determine Q: The number of cosets, $|\det(\mathbf{Q})|$ (which will manifest in part (2)), must be large 
enough to support the target number of points we want to transmit. Therefore, let 

$$\mathbf{Q} = [\sigma\mathbf{G}^{-1}], \qquad |\det(\mathbf{Q})| \geq \sigma^{M}$$

where $[\cdot]$ denotes rounding, which makes the set of perturbation vectors $\mathbf{u}$ belong to the integer lattice 
$\mathbb{Z}^{M}$, and $|\det(\mathbf{Q})|$ is the volume of the parallelotope defined by $\mathbf{Q}$, or equivalently, the number of points 
in the parallelotope. $\sigma^{M}$ is the number of transmitted points. The parameter $\sigma$ should be chosen to be 
the smallest value that ensures that the number of points in the shaped constellation is no smaller than the number 
of points in the unshaped constellation, so no information will be lost. For the case of interest here, $\mathbf{Q}$ can 
always be chosen as a nonsingular matrix when $\sigma$ is large enough. 
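
A minimal sketch of this selection, assuming NumPy, is given below. The grid search over $\sigma$, the step size, and the function name `choose_q` are our own illustrative choices; the text above only requires that $\sigma$ be the smallest value for which the parallelotope of $\mathbf{Q}$ holds enough points. The example generator matrix is hypothetical and is not taken from any of the codes considered in this paper.

```python
import numpy as np

def choose_q(G, num_points, sigma_step=0.05, sigma_max=1e3):
    """Smallest sigma on a grid such that Q = round(sigma * inv(G)) has |det Q| >= num_points."""
    G_inv = np.linalg.inv(G)
    sigma = sigma_step
    while sigma <= sigma_max:
        Q = np.rint(sigma * G_inv).astype(np.int64)
        # small tolerance because the determinant is evaluated in floating point
        if abs(np.linalg.det(Q)) >= num_points - 1e-9:
            return sigma, Q
        sigma += sigma_step
    raise ValueError("no suitable sigma found below sigma_max")

# hypothetical 2x2 generator matrix with unit determinant (illustration only)
G = np.array([[1.0, 0.5], [-0.5, 1.0]]) / np.sqrt(1.25)
sigma, Q = choose_q(G, num_points=16)
```

Because of the rounding, $|\det(\mathbf{Q})|$ is not monotone in $\sigma$ in general, which is why the sketch scans upward and stops at the first value that qualifies.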
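(2) Find the coset leaders: The coset leaders $\mathbf{s}$ must satisfy 

$$\text{if } \mathbf{s}^i \neq \mathbf{s}^j, \text{ then } \mathbf{s}^i \neq \mathbf{s}^j + \mathbf{Q}\mathbf{z}, \quad \forall \mathbf{z} \in \mathbb{Z}^{M} \tag{19}$$

where $\mathbf{s}^i$, $\mathbf{s}^j$ are coset leaders of two different cosets $\mathbf{s}^i + \mathbf{Q}\mathbb{Z}^{M}$, $\mathbf{s}^j + \mathbf{Q}\mathbb{Z}^{M}$, respectively, so there is no 
ambiguity in decoding. As an example, consider the simplest case when 

$$\mathbf{Q} = \mathbf{D} = \mathrm{diag}(d_1,d_2,\dots,d_M).$$

Denoting $S$ as the set of coset leaders, it is natural to choose $S$ as 

$$S = \{\mathbf{s} \mid 0 \leq s_i < d_i, \ i = 1,2,\dots,M\} \tag{20}$$

where $\mathbf{s} = [s_1,s_2,\dots,s_M]^T$. Obviously, the coset leaders $\mathbf{s} \in S$ satisfy (19) and $S$ contains all the coset 
leaders. The number of coset leaders is equal to $|\det(\mathbf{D})|$. For example, 

$$\text{if } \mathbf{D} = \begin{bmatrix} 1 & & \\ & 2 & \\ & & 3 \end{bmatrix}, \text{ then } 
S = \left\{ \begin{matrix} [0,0,0]^T, & [0,0,1]^T, & [0,0,2]^T, \\ [0,1,0]^T, & [0,1,1]^T, & [0,1,2]^T \end{matrix} \right\}$$

and the number of coset leaders in $S$ is $\det(\mathbf{D}) = 6$. 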
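
The diagonal case can be enumerated directly. The short sketch below lists the set $S$ of (20) for a given diagonal; the function name `coset_leaders_diagonal` is ours.

```python
from itertools import product

def coset_leaders_diagonal(d):
    """All s with 0 <= s_i < d_i, i.e. the set S of (20) for Q = diag(d)."""
    return [list(s) for s in product(*(range(d_i) for d_i in d))]

leaders = coset_leaders_diagonal([1, 2, 3])
print(leaders)
```

For $\mathbf{D} = \mathrm{diag}(1,2,3)$ this produces the six vectors listed in the example above.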

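For the general case when $\mathbf{Q}$ is not a diagonal matrix, decompose $\mathbf{Q}$ into 

$$\mathbf{Q} = \mathbf{U}\mathbf{D}\mathbf{V} \tag{21}$$

where $\mathbf{U}$, $\mathbf{V}$ are unimodular matrices (i.e., integer matrices with $|\det(\mathbf{U})| = 1$, $|\det(\mathbf{V})| = 1$). The matrix 
$\mathbf{D}$ is called the Smith Normal Form (SNF) of the matrix $\mathbf{Q}$ [11]. We can first index the coset leaders as in 
(20), and left-multiply $\mathbf{s}$ by $\mathbf{U}$ such that $\mathbf{U}\mathbf{s}$ is the coset leader of $\mathbf{U}\mathbf{s} + \mathbf{U}\mathbf{D}\mathbb{Z}^{M}$. Define 

$$S_U = \{\mathbf{U}\mathbf{s} \mid 0 \leq s_i < d_i, \ i = 1,2,\dots,M\}. \tag{22}$$

Since $\mathbf{U}$ is a unimodular matrix, $S_U$ contains all coset leaders of $\mathbf{U}\mathbf{s} + \mathbf{U}\mathbf{D}\mathbb{Z}^{M}$, and 

$$\mathbf{U}\mathbf{s} + \mathbf{U}\mathbf{D}\mathbb{Z}^{M} = \mathbf{U}\mathbf{s} + \mathbf{U}\mathbf{D}(\mathbf{V}\mathbb{Z}^{M}) = \mathbf{U}\mathbf{s} + \mathbf{Q}\mathbb{Z}^{M} \tag{23}$$

where the first equality follows from the fact that the lattice $\mathbf{V}\mathbb{Z}^{M}$ is identical to the lattice $\mathbb{Z}^{M}$ when 
$\mathbf{V}$ is a unimodular matrix. Thus, $S_U$ contains all coset leaders of $\mathbf{U}\mathbf{s} + \mathbf{Q}\mathbb{Z}^{M}$. 
The SNF decomposition can be performed via column and row operations, which generally have the 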
problem of intermediate expression swell. One can use modular arithmetic to control expression swell 

m. 

After examining the above algorithms, we find that the diagonalization in the SNF decomposition is not 
necessary. Instead, we can decompose $\mathbf{Q}$ as 

$$\mathbf{Q} = \mathbf{R}\mathbf{V} \tag{24}$$

where $\mathbf{V}$ is unimodular and $\mathbf{R}$ is an integer lower triangular matrix. The existence of the decomposition 
$\mathbf{Q} = \mathbf{R}\mathbf{V}$, known as the Hermite Normal Form (HNF), is guaranteed by a theorem [13], which 
is stated here for completeness. 

Theorem 3: Any $M \times M$ invertible integer matrix $\mathbf{Q}$ can be decomposed into $\mathbf{Q} = \mathbf{R}\mathbf{V}$, where $\mathbf{V}$ is a 
unimodular matrix and $\mathbf{R}$ is an integer lower triangular matrix. 
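
One way to obtain such an $\mathbf{R}$ is to apply integer column operations (each a right-multiplication by a unimodular matrix) until the matrix is lower triangular. The sketch below is a plain Euclidean elimination in exact integer arithmetic; it is an illustrative implementation of ours, not the algorithm of [13] or [14], and it does not reduce the off-diagonal entries or return $\mathbf{V}$.

```python
def hnf_lower(Q):
    """Return an integer lower-triangular R (positive diagonal) with Q = R V for a unimodular V."""
    A = [[int(x) for x in row] for row in Q]
    M = len(A)
    for i in range(M):
        for j in range(i + 1, M):
            # Euclidean algorithm on the pair (A[i][i], A[i][j]) using column operations
            while A[i][j] != 0:
                q = A[i][i] // A[i][j]
                for r in range(M):
                    A[r][i] -= q * A[r][j]               # column_i -= q * column_j
                    A[r][i], A[r][j] = A[r][j], A[r][i]  # swap columns i and j
        if A[i][i] < 0:
            for r in range(M):
                A[r][i] = -A[r][i]                       # flip sign of column i
    return A

R = hnf_lower([[2, 1], [1, 3]])
print(R)
```

Since only unimodular column operations are used, $|\det(\mathbf{R})| = |\det(\mathbf{Q})|$, so the product of the diagonal entries of $\mathbf{R}$ gives the number of cosets.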

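Let $r_{ii} \neq 0$ be the diagonal elements of $\mathbf{R}$. Then we can form the set of coset leaders $S$ as 

$$S = \{\mathbf{s} \mid 0 \leq s_i < r_{ii}\}. \tag{25}$$

The validity of this set of coset leaders can be verified by the following theorem. 

Theorem 4: Given a matrix $\mathbf{Q} = \mathbf{R}\mathbf{V}$, the set $S$ defined in (25) contains all the coset leaders of $\mathbf{s} + \mathbf{Q}\mathbb{Z}^{M}$. 

Proof: As in (23), the coset leaders of $\mathbf{s} + \mathbf{Q}\mathbb{Z}^{M}$ are the coset leaders of $\mathbf{s} + \mathbf{R}\mathbb{Z}^{M}$ since 

$$\mathbf{s} + \mathbf{Q}\mathbb{Z}^{M} = \mathbf{s} + \mathbf{R}(\mathbf{V}\mathbb{Z}^{M}) = \mathbf{s} + \mathbf{R}\mathbb{Z}^{M}. \tag{26}$$

To show that each $\mathbf{s} \in S$ is a valid coset leader, we need to prove that for $\mathbf{s}^i, \mathbf{s}^j \in S$, 

$$\text{if } \mathbf{s}^i = \mathbf{s}^j + \mathbf{R}\mathbf{z}, \ \mathbf{z} \in \mathbb{Z}^{M}, \text{ then } \mathbf{s}^i = \mathbf{s}^j.$$

The proof goes by induction. Let $\mathbf{z} = [z_1,z_2,\dots,z_M]^T$ and let $r_{ij}$ be the entries of $\mathbf{R}$. Note that from (25), if 
$\mathbf{s}^i = \mathbf{s}^j + \mathbf{R}\mathbf{z}$, then $s^i_1 = s^j_1$, $z_1 = 0$. Suppose $s^i_k = s^j_k$ and $z_k = 0$ for $k = 1,2,\dots,m-1$. Then 

$$s^i_m = s^j_m + \sum_{k=1}^{m} r_{mk} z_k = s^j_m + r_{mm} z_m,$$

and since $0 \leq s^i_m, s^j_m < r_{mm}$, we have $z_m = 0$ and $s^i_m = s^j_m$, 
which completes the induction. Finally, $S$ contains all the coset leaders since $|\det(\mathbf{Q})| = |\det(\mathbf{R})|$. This 
completes the proof. 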

■ 

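(3) Put x in an approximate hypercube: Since the coset leader in $S$ is not necessarily in the 
parallelotope enclosed by the columns of $\mathbf{Q}$, we need to apply the modulo-$\mathbf{Q}$ operation in (27) to put 
$\mathbf{s}$ in the shaped constellation and transmit $\mathbf{x} = \mathbf{G}\hat{\mathbf{s}}$: 

$$\boldsymbol{\gamma} = \lfloor \mathbf{Q}^{-1}\mathbf{s} \rfloor, \qquad \hat{\mathbf{s}} = \mathbf{s} - \mathbf{Q}\boldsymbol{\gamma} \tag{27}$$

where $\lfloor\cdot\rfloor$ denotes the floor function. As a side note, it is desirable to translate $\hat{\mathbf{s}}$ to minimize the transmit 
power (i.e., make $E[\mathbf{x}] = 0$). In this paper, however, we are only concerned with the shape of the constellation. 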
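
The modulo-$\mathbf{Q}$ operation (27) can be sketched as follows, assuming NumPy. The function name `mod_q` and the small offset added before flooring (a guard against floating-point error when an entry of $\mathbf{Q}^{-1}\mathbf{s}$ is an exact integer) are our own choices; an exact rational implementation would avoid that guard.

```python
import numpy as np

def mod_q(s, Q):
    """Return s_hat = s - Q * floor(Q^{-1} s), the member of s + Q Z^M inside the parallelotope of Q."""
    s = np.asarray(s, dtype=np.int64)
    Q = np.asarray(Q, dtype=np.int64)
    coords = np.linalg.solve(Q.astype(float), s.astype(float))  # Q^{-1} s
    gamma = np.floor(coords + 1e-9).astype(np.int64)             # guard against round-off
    return s - Q @ gamma

s_hat = mod_q([0, 4], [[2, 1], [1, 3]])
print(s_hat)
```

After the reduction, $\mathbf{Q}^{-1}\hat{\mathbf{s}}$ has all entries in $[0,1)$, so $\mathbf{G}\hat{\mathbf{s}}$ lies in the approximate hypercube whenever $\mathbf{G}\mathbf{Q} \approx \sigma\mathbf{I}$.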

Now we summarize the algorithm using the HNF decomposition as follows: 
Encoding: Let $\mathbf{s}$ defined in (25) be the canonical representation of an integer $I$ which represents the 
data to be sent. $\mathbf{s}$ can be obtained by the following recursive modulo operation 

$$s_1 = I \bmod r_{11}, \quad I_1 = \frac{I - s_1}{r_{11}}, \qquad
s_i = I_{i-1} \bmod r_{ii}, \quad I_i = \frac{I_{i-1} - s_i}{r_{ii}} \tag{28}$$

where $2 \leq i \leq M$. Then apply the operation defined in (27) and transmit $\mathbf{x} = \mathbf{G}\hat{\mathbf{s}}$. 
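
The recursion (28) is a mixed-radix expansion of $I$ with radices $r_{11},\dots,r_{MM}$. A minimal sketch follows; `index_to_leader` mirrors (28), while `leader_to_index` is a hypothetical helper of ours for the receiver side, which the text above does not spell out.

```python
def index_to_leader(index, diag):
    """Map an integer 0 <= index < prod(diag) to s with 0 <= s_i < diag[i], following (28)."""
    s = []
    for r in diag:
        s.append(index % r)
        index = (index - s[-1]) // r
    return s

def leader_to_index(s, diag):
    """Inverse of index_to_leader (hypothetical helper), evaluated by Horner's rule."""
    index = 0
    for s_i, r in zip(reversed(s), reversed(diag)):
        index = index * r + s_i
    return index
```

For instance, with diagonal entries $(1, 5)$ the integer $I = 4$ maps to $\mathbf{s} = [0, 4]^T$.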

Decoding: First, an estimate of $\hat{\mathbf{s}}$ is obtained from the received signal (using, e.g., sphere decoding). 
Let $\mathbf{r}_i$ be the $i$-th column of $\mathbf{R}$. The decoding algorithm can be arranged to be top-down: 

    s_1 = ŝ_1 mod r_11    (s_1 = ŝ_1 + q_1 r_11) 
    for i = 2 : M 
        ŝ = ŝ + q_{i-1} r_{i-1}                        (29) 
        s_i = ŝ_i mod r_ii    (s_i = ŝ_i + q_i r_ii) 
    end 
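
The top-down procedure (29) can be sketched in exact integer arithmetic as follows. The function name `recover_leader` is ours, and the sketch assumes $r_{ii} > 0$ and an error-free estimate of $\hat{\mathbf{s}}$. It applies the column update at the end of each iteration rather than at the start of the next, which is equivalent.

```python
def recover_leader(s_hat, R):
    """Recover the coset leader s from s_hat = s - R z, with R lower triangular and r_ii > 0."""
    s_hat = list(s_hat)
    M = len(R)
    s = []
    for i in range(M):
        s_i = s_hat[i] % R[i][i]
        q_i = (s_i - s_hat[i]) // R[i][i]   # integer with s_i = s_hat_i + q_i * r_ii
        s.append(s_i)
        # add q_i times column i of R, removing the contribution of z_i from later entries
        for k in range(i, M):
            s_hat[k] += q_i * R[k][i]
    return s

print(recover_leader([1, 2], [[1, 0], [3, 5]]))
```

Only the entries on and below the diagonal of each column are touched, which is where the saving relative to a full matrix multiplication comes from.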

Compared to the similar approach proposed in [10], our method saves the multiplication by $\mathbf{U}$ in 
(22) in encoding, and half of the multiplications in decoding due to the lower triangular matrix, although 
both schemes have the same order of complexity $O(M^2)$. Moreover, $\mathbf{U}$ may sometimes have exceedingly 
large entries. Our scheme only requires $\mathbf{R}$, and it is more efficient to compute only $\mathbf{R}$ [14]. 

VI. APPROXIMATE CUBIC SHAPING VIA INTEGER REVERSIBLE MATRIX MAPPING 

In practical communication systems, the number of points in the constellation usually equals a 
number that can be expressed by an integer number of bits. That is, the constellation has $2^{b}$ points, 
where $b$ is a positive integer. In Section V we chose $\mathbf{Q} = [\sigma\mathbf{G}^{-1}]$ to ensure that $\mathbf{Q}$ is an integer matrix. 
However, $|\det(\mathbf{Q})|$ is generally not of the form $2^{b}$ due to the rounding operation. This leads to the 
inconvenience of using a large $I$ in the encoding procedure (28), since it cannot be expressed in terms of 
bits. To avoid this problem, we relax the integer constraints on the entries of $\mathbf{Q}$ and consider a nonlinear 
mapping. Let the unshaped constellation be a hypercube, namely 

$$S = \{\mathbf{s} \mid 0 \leq s_i < \sigma, \ \forall i\}. \tag{30}$$

Clearly, the total number of transmitted points is $\sigma^{M}$ and we can choose $\sigma = 2^{b/M}$. Transform $S$ into a 
shaped constellation $S_Q$ 

$$S_Q = \{\mathbf{Q}\mathbf{s} \mid 0 \leq s_i < \sigma, \ \forall i\} \tag{31}$$

where $\mathbf{Q} = \mathbf{G}^{-1}$ and $|\det(\mathbf{Q})|$ is normalized to 1. Then the $\mathbf{x}$-domain shaped constellation $\mathbf{G}S_Q$ is 
transformed back to a hypercube 

$$\mathbf{G}S_Q = \{\mathbf{G}\mathbf{Q}\mathbf{s} = \mathbf{s} \mid 0 \leq s_i < \sigma, \ \forall i\}.$$

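The problem with (31) is that $\mathbf{Q}\mathbf{s} \notin \mathbb{Z}^{M}$ in general. This will destroy the optimality of the transmitted signal. Naturally, 
one method to try is 

$$[S_Q] = \{[\mathbf{Q}\mathbf{s}] \mid 0 \leq s_i < \sigma, \ \forall i\}. \tag{32}$$

However, there is a possibility that, for $\mathbf{s}^i, \mathbf{s}^j \in S$, 

$$\mathbf{s}^i \neq \mathbf{s}^j \ \text{ but } \ [\mathbf{Q}\mathbf{s}^i] = [\mathbf{Q}\mathbf{s}^j]. \tag{33}$$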
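
A small numerical illustration of the ambiguity (33) is sketched below, assuming NumPy. The matrix used, a rotation by 45 degrees with unit determinant, is a hypothetical choice of ours and is not the inverse generator of any code considered here.

```python
import numpy as np
from itertools import product

theta = np.pi / 4
# hypothetical Q with det(Q) = 1 (illustration only)
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

sigma = 4
points = [np.array(s) for s in product(range(sigma), repeat=2)]
rounded = {tuple(int(v) for v in np.rint(Q @ s)) for s in points}
print(len(points), len(rounded))  # fewer distinct images than inputs means collisions
```

Here $[1,0]^T$ and $[2,0]^T$ are both rounded to $[1,1]^T$, so plain rounding is not invertible on this constellation.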



To resolve the ambiguity, we choose an integer-to-integer reversible mapping [15], through which valid 
shaped symbols can be found. Furthermore, the shaped constellation will be similar to that obtained using (32). 

Firstly we borrow some definitions from [15]. If there exists an elementary reversible structure based 
on a matrix for perfectly invertible integer implementation, the matrix is called an elementary reversible 
matrix (ERM). Consider an upper or lower triangular matrix $\mathbf{A}$ whose diagonal elements are $j_i = \pm 1$. A 
reversible integer mapping is defined as follows [15]: 

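Let $\mathbf{A}$ be an $M \times M$ upper triangular matrix with elements $\{a_{mn}\}$, and $\mathbf{y} = [\mathbf{A}\mathbf{s}]$, that is, 

$$y_m = j_m s_m + \left[\sum_{n=m+1}^{M} a_{mn} s_n\right], \quad m = 1,2,3,\dots,M-1, \qquad y_M = j_M s_M. \tag{34}$$

The inverse mapping from $\mathbf{y}$ to $\mathbf{s}$ is 

$$s_M = y_M / j_M, \qquad s_m = (1/j_m)\left(y_m - \left[\sum_{n=m+1}^{M} a_{mn} s_n\right]\right), \quad m = M-1, M-2, \dots, 1. \tag{35}$$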
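
The pair (34)-(35) can be sketched as follows, assuming NumPy. The function names and the example matrix are ours (hypothetical); the key point is that the inverse recomputes exactly the same rounded sum from the already recovered entries, so the rounding cancels.

```python
import numpy as np

def term_forward(A, s):
    """y = [A s] for an upper triangular A with diagonal entries +1 or -1, as in (34)."""
    s = np.asarray(s, dtype=np.int64)
    M = len(s)
    y = np.zeros(M, dtype=np.int64)
    for m in range(M):
        j_m = int(round(A[m, m]))
        y[m] = j_m * s[m] + int(np.rint(A[m, m + 1:] @ s[m + 1:]))
    return y

def term_inverse(A, y):
    """Recover s from y by back substitution, as in (35)."""
    y = np.asarray(y, dtype=np.int64)
    M = len(y)
    s = np.zeros(M, dtype=np.int64)
    for m in range(M - 1, -1, -1):
        j_m = int(round(A[m, m]))
        s[m] = (y[m] - int(np.rint(A[m, m + 1:] @ s[m + 1:]))) // j_m  # exact, since j_m = +-1
    return s

# hypothetical upper triangular ERM (illustration only)
A = np.array([[1.0, 0.4, -1.3],
              [0.0, 1.0,  0.7],
              [0.0, 0.0, -1.0]])
y = term_forward(A, [2, -1, 3])
```

Applying `term_inverse(A, y)` returns the original integer vector for any integer input, even though the off-diagonal entries of `A` are not integers.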

Similar results can be obtained for a lower triangular matrix. This kind of triangular matrix is called a 
triangular ERM (TERM). If all the diagonal elements of a TERM are equal to 1, the TERM is a unit 
TERM. There is another feasible ERM form known as the single-row ERM (SERM), with $j_m = \pm 1$ on 
the diagonal and only one row of off-diagonal elements that are not all zeros. The reversible integer mapping 
of a SERM is straightforward: 

$$y_m = \begin{cases}
j_{m'} s_{m'} + \left[\displaystyle\sum_{n \neq m'} a_{m'n} s_n\right], & \text{for } m = m' \\
j_m s_m, & \text{otherwise}
\end{cases} \tag{36}$$

where $m'$ is the row with nonzero off-diagonal elements. The inverse operation is 

$$s_m = y_m / j_m, \ \text{for } m \neq m', \qquad
s_{m'} = \left(y_{m'} - \left[\sum_{n \neq m'} a_{m'n}\, y_n / j_n\right]\right) / j_{m'}. \tag{37}$$

Denote $\mathbf{S}_0$ as a unit SERM with $m' = M$. It has been shown in [15] that $\mathbf{Q}$ has a "PLUS" factorization. 

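Theorem 5: Matrix $\mathbf{Q}$ has a TERM factorization of $\mathbf{Q} = \mathbf{P}\mathbf{L}\mathbf{U}\mathbf{S}_0$ if and only if $\det(\mathbf{Q}) = \det(\mathbf{P}) = \pm 1$, 
where $\mathbf{L}$, $\mathbf{U}$ are unit lower and unit upper TERMs, respectively, and $\mathbf{P}$ is a permutation matrix subject 
to a possible negative sign. 

From (31), clearly, $\mathbf{Q}$ satisfies the property that $\det(\mathbf{Q}) = \pm 1$. Now, we summarize the shaping algorithm 
using the PLUS factorization. 

Encoding: In contrast to (32), we decompose $\mathbf{Q}$ into $\mathbf{Q} = \mathbf{P}\mathbf{L}\mathbf{U}\mathbf{S}_0$ to obtain an integer-to-integer 
reversible mapping. The shaping algorithm is 

$$\hat{\mathbf{s}} = \mathbf{P}\left[\mathbf{L}\left[\mathbf{U}\left[\mathbf{S}_0\mathbf{s}\right]\right]\right], \quad \mathbf{s} \in S \tag{38}$$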

(39) 

where $S$ is defined in (30). Then $\mathbf{x} = \mathbf{G}\hat{\mathbf{s}}$ is transmitted. 
Decoding: First, an estimate of $\hat{\mathbf{s}}$ is obtained from the received signal. Then the inverse operations, 
including (35) and (37), are used to recover $\mathbf{s}$ from $\hat{\mathbf{s}}$. 

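For $\mathbf{Q} = \mathbf{P}\mathbf{L}\mathbf{U}\mathbf{S}_0$, with $\mathbf{e}_r^{(i)}$ denoting the rounding error vector that results from the transform 
of the $i$-th ERM, the total error due to reversible integer mapping is 

$$|\mathbf{e}_r| = \left|\mathbf{P}\left(\mathbf{e}_r^{(1)} + \mathbf{L}\mathbf{e}_r^{(2)} + \mathbf{L}\mathbf{U}\mathbf{e}_r^{(3)}\right)\right| \tag{40}$$

and $\hat{\mathbf{s}} = \mathbf{Q}\mathbf{s} + \mathbf{e}_r$. When using (38) to shape the constellation, if we view it as a linear operation (as the 
constellation becomes large, the effect of rounding is relatively minor), we actually choose $\mathbf{v}^i$ defined in 
(13) as 

$$\mathbf{v}^i = \mathbf{P}\left[\mathbf{L}\left[\mathbf{U}\left[\mathbf{S}_0\,\sigma\mathbf{e}^i\right]\right]\right], \qquad
\mathbf{e}^i = [0_{(1)},\dots,0_{(i-1)},1_{(i)},0_{(i+1)},\dots,0_{(M)}]^T. \tag{41}$$

From (40), 

$$\mathbf{G}\mathbf{v}^i = \tilde{\mathbf{v}}^i = \sigma\mathbf{e}^i + \mathbf{G}\mathbf{e}_r.$$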

Obviously, $v'$ satisfies the property (13) when $a$ is large enough. Thus this method is also an approximate 
cubic shaping, as described in Section V. The complexity of (38) is about $O(M^2)$, which is smaller than 
that of (27). Moreover, if there is an efficient algorithm to do the multiplication by $Q$, the complexity 
can be further reduced. For example, when $Q$ is a discrete Fourier transform (DFT) matrix, we can use 
a structure similar to the FFT to obtain a more efficient algorithm with complexity $O(M \log M)$ [26]. The 
drawback of this method is the accumulated rounding error, which leads to some signals with relatively high 
PAPR. However, we can still expect that the shaped signals have low PAPR values with high probability. 
It is more convenient and better for shaping to use the complex representation. Thus (11) becomes 

x = Gs, se (Z[/])^,GGCTxf. (42) 

When using the complex representation, the corresponding $j_m$ in SERM and TERM can be $\pm 1$ or $\pm i$, 
and $[\cdot]$ denotes rounding the real and imaginary components individually. The inverse operations (35) and 
(37) still work. There is a corresponding theorem [15], as follows. 

Theorem 6: Matrix $Q$ has a factorization of $Q = P L D_R U S_0$ if and only if $\det(Q) = \det(D_R) \neq 0$, where 
$D_R = \mathrm{diag}(1, 1, \ldots, 1, e^{i\theta})$, $L$ and $U$ are lower and upper TERMs, respectively, and $P$ is a permutation matrix. 
If $\det(Q) = \pm 1$ or $\pm i$, we have a simplified factorization, $Q = PLUS_0$. It is in fact a generalization of 
the lifting schemes in [27]. 

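When $\det(Q) = e^{i\theta}$ is not equal to $\pm 1$ or $\pm i$, a complex rotation $e^{i\theta}$ can be implemented with the real 
and imaginary components of a complex number and factorized into three unit TERMs as 

$$
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
=
\begin{bmatrix} 1 & (\cos\theta - 1)/\sin\theta \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ \sin\theta & 1 \end{bmatrix}
\begin{bmatrix} 1 & (\cos\theta - 1)/\sin\theta \\ 0 & 1 \end{bmatrix}.
$$

Therefore, Theorem 6 shows that given a nonsingular matrix, we can always derive an integer reversible 
mapping by a factorization, which is what we need for constellation shaping. 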
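
The three-shear factorization of a rotation can be turned into an integer-to-integer map by rounding inside each shear. The following Python sketch is an illustration under the assumption that $\sin\theta \neq 0$; the function names are hypothetical. It uses Python's built-in `round`, and any deterministic rounding rule would serve equally well as long as the forward and inverse steps share it.

```python
import math


def lifting_coeffs(theta):
    """Shear coefficients (a, s) of the three-factor form; needs sin(theta) != 0."""
    s = math.sin(theta)
    a = (math.cos(theta) - 1.0) / s
    return a, s


def rotate_int(x, y, theta):
    """Approximate rotation of the integer pair (x, y) by theta, exactly invertible."""
    a, s = lifting_coeffs(theta)
    x = x + round(a * y)      # right-most factor [[1, a], [0, 1]]
    y = y + round(s * x)      # middle factor     [[1, 0], [s, 1]]
    x = x + round(a * y)      # left-most factor  [[1, a], [0, 1]]
    return x, y


def unrotate_int(x, y, theta):
    """Undo rotate_int by subtracting the same rounded terms in reverse order."""
    a, s = lifting_coeffs(theta)
    x = x - round(a * y)
    y = y - round(s * x)
    x = x - round(a * y)
    return x, y
```

The output differs from the exact rotation only by accumulated rounding errors of the three steps, which stay on the order of one unit per coordinate for moderate angles.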

VII. Simulation Results 

In this section, we present simulation results for shaping of space-time codes designed in [7] and [28] 
by using 10^ randomly generated symbols. Since the signals transmitted by the antennas have similar 
statistical distributions, the simulation results are presented as the average complementary cumulative 
density function (CCDF) of the PAPR of signals on each antenna $i$, expressed as follows: 

$$\mathrm{CCDF}\{\mathrm{PAPR}(x_i)\} = P\{\mathrm{PAPR}(x_i) > \rho_i\}, \tag{43}$$

where $\mathrm{PAPR}(x_i) = \dfrac{|x_i|^2}{E[|x_i|^2]}$. This can be interpreted as the probability that the PAPR of a symbol $x_i$ exceeds 
a certain PAPR constraint, $\rho_i$. 
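
An empirical version of this CCDF can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the expectation in the PAPR definition is replaced by the sample mean over the supplied symbols, which is an assumption of the sketch rather than a description of the simulation setup used here.

```python
import numpy as np


def papr(x):
    """Per-sample PAPR |x|^2 / E[|x|^2], with E replaced by the sample mean."""
    p = np.abs(np.asarray(x)) ** 2
    return p / p.mean()


def ccdf(x, thresholds_db):
    """Empirical P{PAPR(x) > rho} for each threshold rho given in dB."""
    ratio_db = 10.0 * np.log10(papr(x))
    return np.array([(ratio_db > t).mean() for t in thresholds_db])
```

For a constant-modulus signal every sample has a PAPR of 0 dB, so the estimate drops from 1 to 0 as the threshold crosses 0 dB.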

We first look at the 4 × 4 space-time code designed in [28], which achieves the D-MG tradeoff. 
Fig. 1 shows the CCDF of the PAPR on 4 antennas using the HNF and PLUS approximate cubic 
shaping introduced in Section V and Section VI, respectively. The effect of the constellation size is also 
investigated. When the constellation size is moderate (64 QAM), it is observed that the HNF shaping 
method results in about 1.3 dB larger reduction in the PAPR than the PLUS shaping, which provides 
about 2 dB PAPR reduction. The PLUS shaping has a worse performance due to the accumulation of 
rounding errors (40). As the constellation size becomes large (dense), we can expect that the rounding 
error becomes relatively small, and both methods' PAPR will approach the optimal value for cubic 
shaping, namely $10\log_{10} 3 \approx 4.77$ dB. This trend is shown by the curves of the PLUS shaping. The HNF 
shaping result with 256 QAM was not obtained due to its excessively high computational complexity. 
Table I shows the increased average power (compared to the average power without shaping) due to the 
few points outside the hypercube. As the constellation size becomes large (and more cubic), the power 
increment decreases. 

TABLE I 

Increased average power for a 4 × 4 space-time code [28] using HNF and PLUS approximate cubic shaping. 

              HNF shaping    PLUS shaping 
64QAM         4.9%           4.6% 
256QAM        -              3.5% 

TABLE II 

Increased average power for a 5 × 5 space-time code [7] using HNF and PLUS approximate cubic shaping. 

              HNF shaping    PLUS shaping 
64QAM         5.4%           4.8% 
256QAM        -              3.2% 
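
The limiting PAPR for cubic shaping can be checked with a short calculation. As an idealization assumed only for this sketch, let the real and imaginary parts of a transmitted symbol $x$ be independent and uniformly distributed on $[-A, A]$:

```latex
\begin{align*}
\max |x|^2 &= A^2 + A^2 = 2A^2, \\
E\big[|x|^2\big] &= 2 \int_{-A}^{A} \frac{u^2}{2A}\, du = \frac{2A^2}{3}, \\
\mathrm{PAPR} &= \frac{2A^2}{2A^2/3} = 3,
\qquad 10 \log_{10} 3 \approx 4.77~\text{dB}.
\end{align*}
```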



[Figure 1: CCDF (logarithmic vertical axis) versus PAPR (dB), from 0 to 9 dB. Curves: without shaping (64QAM, 256QAM); PLUS, 64QAM; HNF, 64QAM; PLUS, 256QAM.] 

Fig. 1. CCDF of PAPR for a 4 × 4 space-time code [28] using HNF and PLUS approximate cubic shaping. 

In Fig. 2, we investigate the 5 × 5 space-time code given in [7], which also achieves the D-MG tradeoff. 
Similar trends as in the 4 × 4 case can be observed. 

Fig. 2. CCDF of PAPR for a 5 × 5 space-time code [7] using HNF and PLUS approximate cubic shaping. 

Finally, Fig. 3 presents the codeword error probability (CEP) of systems with 4 or 5 receive antennas 
and 4 or 5 transmit antennas in quasi-static Rayleigh fading channels. Here, we use the perfect space-time 
codes in [28] and [7] for the 4 × 4 and 5 × 5 channels, respectively. The codeword sizes are also 4 × 4 
and 5 × 5 symbols, respectively. The sphere decoder in [29] is used for lattice decoding. The results show 
that the space-time codes after shaping yield almost indistinguishable error performance compared to the 
performance without shaping. 



[Figure 3: Probability (logarithmic vertical axis) versus SNR (dB), from 12.5 to 32.5 dB. Curves: CEP for 4x4 channel (24 bpcu) with shaping; CEP for 4x4 channel (24 bpcu) without shaping; outage probability for 4x4 channel (24 bpcu); CEP for 5x5 channel (30 bpcu) with shaping; CEP for 5x5 channel (30 bpcu) without shaping; outage probability for 5x5 channel (30 bpcu).] 

Fig. 3. Codeword error probability for Rayleigh fading channel with or without shaping. 



VIII. Conclusion 

In this paper, we first showed that, for Rayleigh fading channels, the D-MG tradeoff remains unchanged 
with any PAPR constraints larger than one. This result implies that, instead of designing codes on 
a case-by-case basis, as done by most existing works, there possibly exist general methodologies for 
designing space-time codes with low PAPR that achieve the optimal D-MG tradeoff. As an example 
of such methodologies, we proposed a PAPR reduction method based on constellation shaping that 
can be applied to existing optimal space-time codes without affecting their optimality in the D-MG 
tradeoff. Unlike most PAPR reduction methods, the proposed method does not introduce redundancy 
or require side information to be transmitted to the decoder. Two realizations of the proposed method 
were considered. The first utilizes the Hermite Normal Form decomposition of integer matrices. The 
second utilizes the integer reversible mapping. Compared to the previous works [12], [10], which applied 
a similar approach (Smith Normal Form) to single-antenna OFDM systems, the proposed method has 
lower complexity. In addition, even though [12] managed to reduce the complexity to the same order 
$O(M \log M)$ as the proposed integer reversible mapping scheme (in the single-antenna OFDM case) by 
using a Hadamard matrix, that approach affects the PAPR reduction capability and only works for OFDM 
systems. The proposed method, on the other hand, works for any nonsingular generator (modulation) 
matrix and can achieve better PAPR reduction. Sphere decoding was performed to verify that the proposed 
PAPR reduction method does not affect the optimality of space-time codes. 

Appendix A 
Proof of Lemma 1 

Following the method in [20], since the receiver knows the realization of $\mathbf{H}$, the channel output is the 
pair $(y, \mathbf{H})$. The mutual information between channel input and output is then 

$$I(x; (y, \mathbf{H})) = I(x; \mathbf{H}) + I(x; y \mid \mathbf{H}) = I(x; y \mid \mathbf{H}). \tag{44}$$

Denote $h(x)$ as the differential entropy of $x$ and let $H$ be a particular realization of $\mathbf{H}$. For this $H$, when 
the SNR is asymptotically large, the output differential entropy $h(y \mid \mathbf{H} = H)$ can be well approximated 
by the input differential entropy $h(x \mid \mathbf{H} = H)$. In addition, 

$$I(x; y \mid \mathbf{H} = H) = h(x \mid \mathbf{H} = H) - h(x \mid y, \mathbf{H} = H) \tag{45}$$
$$\phantom{I(x; y \mid \mathbf{H} = H)} = h(x \mid \mathbf{H} = H) - h(e \mid y, \mathbf{H} = H), \tag{46}$$

where $e = x - F_{\mathrm{MMSE}}\, y$, and $F_{\mathrm{MMSE}}$ is the minimum mean-square error (MMSE) estimation filter of $x$ 
given $y$. 

Since the lemma is to lower bound the ergodic channel capacity, any rate achieved by a particular signal 
can serve as a lower bound. We select the transmitted signal $x$ such that $E[x] = 0$ and $S_{xx} = E[xx^\dagger]$ is 
positive definite.* In the following, we will compute the rate achievable by signals with these properties. 
This achievable rate obviously lower bounds the capacity. 

Denote $S_{xy} = E[xy^\dagger]$ and $S_{yy} = E[yy^\dagger]$. According to the Orthogonality Principle, we have 

$$F_{\mathrm{MMSE}} = S_{xy} S_{yy}^{-1} = (H^\dagger H + S_{xx}^{-1})^{-1} H^\dagger, \tag{47}$$

where the matrix inversion lemma 

$$(A + BCD)^{-1} = A^{-1} - A^{-1} B (C^{-1} + D A^{-1} B)^{-1} D A^{-1}$$

is used. $E[ee^\dagger]$ can be computed as 

$$E[ee^\dagger] = S_{xx} - S_{xx} H^\dagger (H S_{xx} H^\dagger + I)^{-1} H S_{xx} = (H^\dagger H + S_{xx}^{-1})^{-1}, \tag{48}$$

where the matrix inversion lemma is again used. Note that $E[e] = 0$ since $E[x] = 0$ and $E[y] = 0$. Thus 
the covariance matrix of $e$, denoted $\mathrm{Cov}[e]$, is equal to $E[ee^\dagger]$. Then we have 

$$h(e \mid y, \mathbf{H} = H) \le h(e \mid \mathbf{H} = H) \le \log\det(\pi e\, \mathrm{Cov}[e]) = \log\det(\pi e\, E[ee^\dagger]). \tag{49}$$
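
The two matrix identities in (47) and (48) can be checked numerically. The sketch below is only a sanity check under assumptions not stated explicitly at this point: the channel model is taken to be $y = Hx + w$ with unit-covariance noise, and the antenna counts, the random channel, and the diagonal $S_{xx}$ are arbitrary test values.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3                                    # receive / transmit antennas (test values)
H = rng.normal(size=(n, m)) + 1j * rng.normal(size=(n, m))
Sxx = np.diag(rng.uniform(0.5, 2.0, size=m)).astype(complex)

Hh = H.conj().T
Syy = H @ Sxx @ Hh + np.eye(n)                 # assumes unit-covariance noise

# MMSE filter: direct form S_xy S_yy^{-1} versus the right-hand side of (47).
F_direct = Sxx @ Hh @ np.linalg.inv(Syy)
F_lemma = np.linalg.inv(Hh @ H + np.linalg.inv(Sxx)) @ Hh

# Error covariance: direct form versus the right-hand side of (48).
E_direct = Sxx - Sxx @ Hh @ np.linalg.inv(Syy) @ H @ Sxx
E_lemma = np.linalg.inv(Hh @ H + np.linalg.inv(Sxx))

print(np.allclose(F_direct, F_lemma), np.allclose(E_direct, E_lemma))
```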



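Define 

$$I'(x; y \mid \mathbf{H} = H) = h(x \mid \mathbf{H} = H) - \log\det(\pi e\, E[ee^\dagger])$$
$$\phantom{I'(x; y \mid \mathbf{H} = H)} = h(x \mid \mathbf{H} = H) + \log\det\left( \frac{1}{\pi e} (H^\dagger H + S_{xx}^{-1}) \right). \tag{50}$$

Obviously, $I'(x; y \mid \mathbf{H} = H) \le I(x; y \mid \mathbf{H} = H)$. We have the ergodic capacity 

$$C = \max_{f_x(x)} I(x; (y, \mathbf{H})) = \max_{f_x(x)} E_{\mathbf{H}}[I(x; y \mid \mathbf{H})] = \max_{f_x(x)} E_{\mathbf{H}}[h(x \mid \mathbf{H}) - h(e \mid y, \mathbf{H})],$$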



* Since space-time codes are open-loop solutions for which the transmitter does not have the channel state information, with 
identical complex Gaussian distributions of the fading coefficients among antennas (as assumed in Section II), a reasonable 
selection is to distribute the transmission power evenly on all the transmit antennas, and let E[x] =0 for power efficiency. 
Together with additional selections, for example, simply letting the entries of x be independent of one another, Sxx becomes 
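positive definite (when the average total transmission power is not zero). 

where $f_x(x)$ is the probability density function of $x$ subject to the average power and PAPR constraints (see (5)). The ergodic capacity is lower 
bounded by 

$$C = \max_{f_x(x)} E_{\mathbf{H}}[I(x; y \mid \mathbf{H})] \ge \max_{f_x(x)} E_{\mathbf{H}}[I'(x; y \mid \mathbf{H})]$$
$$= \max_{f_x(x)} E_{\mathbf{H}}\left[ h(x \mid \mathbf{H}) + \log\det\left( \frac{1}{\pi e} (\mathbf{H}^\dagger \mathbf{H} + S_{xx}^{-1}) \right) \right] \tag{51}$$
$$\ge \max_{f_x(x)} E_{\mathbf{H}}[h(x \mid \mathbf{H})] + \left. E_{\mathbf{H}}\left[ \log\det\left( \frac{1}{\pi e} (\mathbf{H}^\dagger \mathbf{H} + S_{xx}^{-1}) \right) \right] \right|_{f_x(x) = f_x^*(x)} \tag{52}$$
$$= \max_{f_x(x)} h(x) + \left. E_{\mathbf{H}}\left[ \log\det\left( \frac{1}{\pi e} (\mathbf{H}^\dagger \mathbf{H} + S_{xx}^{-1}) \right) \right] \right|_{f_x(x) = f_x^*(x)}, \tag{53}$$

where $f_x^*(x) = \arg\max_{f_x(x)} E_{\mathbf{H}}[h(x \mid \mathbf{H})] = \arg\max_{f_x(x)} h(x)$, and $C'$ denotes the right-hand side of (53). $f_x^*(x)$ and $C'$ can be obtained by solving the following 
problem 

$$\max_{f_x(x)} h(x) \quad \text{s.t.} \quad \mathrm{Tr}\left( E_x[xx^\dagger] \right) \le P, \qquad \frac{|x_i|^2}{E[|x_i|^2]} \le \rho_i, \quad i = 1, \ldots, m. \tag{54}$$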
Due to the circular symmetry of the constraints (see (5)), polar coordinates 

$$x = [x_1, x_2, \ldots, x_m]^T = [r_1 e^{j\theta_1}, r_2 e^{j\theta_2}, \ldots, r_m e^{j\theta_m}]^T, \qquad r_i \ge 0, \quad \theta_i \in [0, 2\pi)$$

are found convenient, where $r_i$ and $\theta_i$ stand, respectively, for the amplitude and phase of $x_i$. Straightforward 
transformation yields 

$$h(x) = -\int f_x(x) \log f_x(x)\, dx = -\int f_{r,\theta}(r, \theta) \log \frac{f_{r,\theta}(r, \theta)}{\prod_{i=1}^{m} r_i}\, dr\, d\theta$$
$$= h(r, \theta) + \sum_{i=1}^{m} \int f_{r_i}(r_i) \log r_i\, dr_i,$$

where $r$ and $\theta$ are vectors consisting of $r_i$ and $\theta_i$, respectively. Note that 

$$h(r, \theta) \le h(r) + h(\theta) \le h(r) + m \log 2\pi.$$

Therefore, to maximize $h(x)$, we should choose $r$ and $\theta$ independent of each other, and all $\theta_i$ distributed 
independently and uniformly in $[0, 2\pi)$. Then the equality holds and 

$$h(x) = h(r) + \sum_{i=1}^{m} \int f_{r_i}(r_i) \log r_i\, dr_i + m \log 2\pi.$$

Similarly, 

$$h(r) \le \sum_{i=1}^{m} h(r_i).$$

Choosing the $r_i$ independent of one another, the equality holds and $h(x)$ is maximized.** Drop the last term 
of $h(x)$, and transform (54) into the following equivalent optimization problem 

$$\max_{f_r(r)} \ \sum_{i=1}^{m} h(r_i) + \sum_{i=1}^{m} \int f_{r_i}(r_i) \log r_i\, dr_i \quad \text{s.t.} \quad \mathrm{Tr}\left( E_r[rr^T] \right) \le P, \qquad \frac{r_i^2}{E[r_i^2]} \le \rho_i, \quad i = 1, \ldots, m. \tag{55}$$



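For each antenna $i$, given the transmission power $P_i$ such that $\sum_{i=1}^{m} P_i \le P$ 
and a PAPR constraint $\rho_i$, similar to [19], the optimal solution $f^*(r_i)$ is (see Appendix B) 

$$f^*(r_i) = \begin{cases} a_i r_i \exp(-b_i r_i^2 / 2), & \forall r_i \in [0, \sqrt{\rho_i P_i}], \\ 0, & \forall r_i \notin [0, \sqrt{\rho_i P_i}], \end{cases} \tag{56}$$

where $a_i$, $b_i$ satisfy (57), (58) or (59): 

when $\rho_i \neq 2$, $\rho_i > 1$, 

$$\frac{a_i}{b_i} \left( 1 - \exp(-b_i \rho_i P_i / 2) \right) = 1, \tag{57}$$
$$2 (a_i / b_i) (b_i \rho_i P_i)^{-1} \left[ 1 - (1 + b_i \rho_i P_i / 2) \exp(-b_i \rho_i P_i / 2) \right] = 1 / \rho_i; \tag{58}$$

when $\rho_i = 2$, 

$$a_i = \frac{2}{\rho_i P_i}, \qquad b_i = 0. \tag{59}$$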



** Note that the selection of independent $\theta_i$'s and $r_i$'s is one of the possible selections we made in the previous footnote to 
make $S_{xx}$ positive definite. 






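Denoting the maximum of $h(x_i)$ as $h^*(x_i)$ and $c_i = b_i \rho_i P_i$, we can compute $h^*(x_i)$ directly by using $f^*(r_i)$: 

$$h^*(x_i) = -\log a_i + \frac{b_i P_i}{2} + \log 2\pi \tag{60}$$
$$= -\log a_i + \frac{c_i}{2\rho_i} + \log 2\pi \tag{61}$$
$$= \begin{cases} \log P_i + \log \dfrac{\rho_i (1 - \exp(-c_i / 2))}{c_i} + \dfrac{c_i}{2\rho_i} + \log 2\pi, & \rho_i \neq 2,\ \rho_i > 1, \\ \log 2\pi P_i, & \rho_i = 2. \end{cases} \tag{62}$$
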

From (57), $(1 - \exp(-b_i \rho_i P_i / 2))^{-1} = a_i / b_i$. By substituting $a_i / b_i$ in (58) with $(1 - \exp(-b_i \rho_i P_i / 2))^{-1}$ and then 
replacing $b_i \rho_i P_i$ with $c_i$, we will arrive at 

$$\frac{2}{c_i} - \frac{1}{1 - \exp(-c_i / 2)} + 1 = \frac{1}{\rho_i},$$

which indicates that $1/\rho_i$ is a monotonic function of $c_i$, as shown in Fig. 4. Thus, when $\rho_i > 1$ is fixed 
and finite, $c_i$ is a finite constant. With the independence between the $x_i$'s, 

$$h^*(x) = \sum_{i=1}^{m} h^*(x_i).$$
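
Because $1/\rho_i$ is monotonic in $c_i$, the constant $c_i$ for a given PAPR constraint can be found by simple bisection. The Python sketch below assumes the relation $2/c - 1/(1 - e^{-c/2}) + 1 = 1/\rho$ obtained from (57) and (58); the function names, the search bracket, and the series approximation used near $c = 0$ are choices made for this illustration only.

```python
import math


def inv_papr(c):
    """Evaluate 2/c - 1/(1 - exp(-c/2)) + 1, which should equal 1/rho."""
    if abs(c) < 1e-6:
        return 0.5 - c / 24.0                  # first-order series around c = 0
    # -1/(1 - exp(-c/2)) is written as 1/expm1(-c/2) for numerical accuracy.
    return 2.0 / c + 1.0 / math.expm1(-c / 2.0) + 1.0


def solve_c(rho, lo=-1000.0, hi=1000.0, iters=200):
    """Bisection for c with inv_papr(c) = 1/rho; inv_papr decreases in c."""
    target = 1.0 / rho
    if not inv_papr(hi) < target < inv_papr(lo):
        raise ValueError("rho lies outside the range covered by the bracket")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if inv_papr(mid) > target:             # value too large: root lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With this relation, $\rho = 2$ corresponds to $c = 0$, larger PAPR constraints give $c > 0$, and constraints between 1 and 2 give $c < 0$.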

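Now we can plug $h^*(x)$ and the corresponding (independent) distribution of $x$ into (53) to obtain the 
lower bound $C'$ of the ergodic capacity. Let 

$$k_i = \begin{cases} \log \dfrac{\rho_i (1 - \exp(-c_i / 2))}{c_i} + \dfrac{c_i}{2\rho_i} + \log \dfrac{2}{e}, & \rho_i \neq 2,\ \rho_i > 1, \\ \log \dfrac{2}{e}, & \rho_i = 2, \end{cases}$$

which is a constant because $c_i$ is a finite constant when $\rho_i > 1$ is fixed and finite. We have 

$$C' = E_{\mathbf{H}}\left[ \log\det\left( I + \mathbf{H} S_{xx} \mathbf{H}^\dagger \right) \right] + \sum_{i=1}^{m} k_i,$$

where the equality follows from the determinant identity $\det(I + AB) = \det(I + BA)$. With the selection 
of equal-power allocation, $P_i = P/m$, $\forall i$. Thus 

$$C \ge C' = E_{\mathbf{H}}\left[ \log\det\left( I + \frac{P}{m} \mathbf{H}\mathbf{H}^\dagger \right) \right] + \sum_{i=1}^{m} k_i. \tag{63}$$

Note that the inequality holds for any distribution of $\mathbf{H}$. When $\rho_i \to \infty$, $a_i = b_i = 2/P_i$, and 

$$C = C' = E_{\mathbf{H}}\left[ \log\det\left( I + \frac{P}{m} \mathbf{H}\mathbf{H}^\dagger \right) \right],$$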



which is the classical result without PAPR constraints. The constant (i.e., the difference between $h^*(x_i)$ 
when $\rho_i \to \infty$ and when $\rho_i$ is finite) is shown in Fig. 5. 

Appendix B 
Derivation of the Optimal Signal Probability Density Functions 

We consider the following optimization problem with average power and PAPR constraints 

$$\max_{f_r(r)} \left( -\int f_r(r) \log \frac{f_r(r)}{r}\, dr \right) \quad \text{s.t.} \quad E[|r|^2] \le P, \qquad \frac{|r|^2}{E[|r|^2]} \le \rho. \tag{64}$$

Note that the PAPR constraint is different from the peak power constraint considered in [19]; thus the 
results in [19] do not directly apply to our case. The following derivation (verification) is necessary. The 
problem will be solved through the following slightly different problem with average and peak power 
constraints 

$$\max_{f_r(r)} \left( -\int f_r(r) \log \frac{f_r(r)}{r}\, dr \right) \quad \text{s.t.} \quad E[|r|^2] \le P, \qquad |r|^2 \le \rho P, \tag{65}$$

which can be rewritten as 

$$\max_{f_r(r)} \ -\int_0^{\sqrt{\rho P}} f_r(r) \log \frac{f_r(r)}{r}\, dr$$
$$\text{s.t.} \quad f_r(r) \ge 0, \ \forall r \in [0, \sqrt{\rho P}], \qquad \int_0^{\sqrt{\rho P}} f_r(r)\, dr = 1, \qquad \int_0^{\sqrt{\rho P}} r^2 f_r(r)\, dr \le P. \tag{66}$$

The optimal solution $f_r^*(r)$ of (66) is given by the standard variational techniques [10]: 

$$f_r^*(r) = a r \exp(-b r^2 / 2), \quad \forall r \in [0, \sqrt{\rho P}], \tag{67}$$
$$f_r^*(r) = 0, \quad \forall r \notin [0, \sqrt{\rho P}]. \tag{68}$$

Observe that if the first equality in (65) holds, the optimal solution $f_r^*(r)$ of (66) is also the optimal 
solution of (64). However, the equality does not always hold. 

We discuss $a$, $b$ for different values of PAPR ($\rho > 1$). When $\rho > 2$, $a$, $b$ satisfy 

$$\frac{a}{b} \left( 1 - \exp(-b \rho P / 2) \right) = 1, \tag{69}$$
$$2 (a/b) (b \rho P)^{-1} \left[ 1 - (1 + b \rho P / 2) \exp(-b \rho P / 2) \right] = 1/\rho. \tag{70}$$

Equations (69) and (70) together solve $b$ as a function of $\rho$ and $P$, which is illustrated in Fig. 4 with 
$T = \rho P / 2$. Fig. 4 shows that $b > 0$ when $\rho > 2$. Thus the first equality in (65) holds and $f_r^*(r)$ is also 
the optimal solution of (64). Note that when $\rho \to \infty$, $a = b = 2/P$ and $f_r^*(r)$ is the Rayleigh distribution 
as expected. 
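
The conditions (69) and (70) state that the truncated density integrates to one and has mean power $P$. A short numerical check can be sketched in Python as follows. It is an illustration under stated assumptions: the parameter $c = b\rho P$ is chosen freely, $\rho$ is recovered from the relation $2/c - 1/(1 - e^{-c/2}) + 1 = 1/\rho$ implied by (69) and (70), and the function names and the trapezoid-rule integration are not part of the original derivation.

```python
import numpy as np


def density_params(c, P):
    """Return (a, b, rho) of f(r) = a r exp(-b r^2 / 2) on [0, sqrt(rho P)] for c = b rho P != 0."""
    rho = 1.0 / (2.0 / c + 1.0 / np.expm1(-c / 2.0) + 1.0)
    b = c / (rho * P)
    a = b / (1.0 - np.exp(-c / 2.0))           # normalization condition (69)
    return a, b, rho


def moments(a, b, rho, P, n=200001):
    """Numerically integrate f(r) and r^2 f(r) over [0, sqrt(rho P)]."""
    r = np.linspace(0.0, np.sqrt(rho * P), n)
    f = a * r * np.exp(-b * r ** 2 / 2.0)
    dr = r[1] - r[0]

    def trap(g):                               # composite trapezoid rule
        return float(np.sum((g[1:] + g[:-1]) * 0.5) * dr)

    return trap(f), trap(r ** 2 * f)
```

For positive $c$ the recovered $\rho$ exceeds 2 and $b$ is positive, in line with the discussion of Fig. 4.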




Fig. 4. The relation between $bT$ ($bT = b\rho P/2 = c/2$, as defined in Appendix A before (60)) and $1/\rho$ subject to (69) and (70). 



When $\rho = 2$, $a$, $b$ satisfy 

$$b = 0, \qquad a = 1/P. \tag{71}$$

In this case, $f_r^*(r)$ is linear and the equality in (65) is again satisfied, and $f_r^*(r)$ is the optimal solution 
of (64). 

For the case of $\rho < 2$, the Karush-Kuhn-Tucker (KKT) conditions for the optimization problem (66) 
require that $b \ge 0$. However, from Fig. 4, $b < 0$ when $\rho < 2$. Therefore, $a$, $b$ for the optimal solution of 
(66) should satisfy (71). In this situation 

$$\int_0^{\sqrt{\rho P}} r^2 f_r^*(r)\, dr = \frac{\rho}{2} P < P. \tag{72}$$

That is, the first equality in (65) does not hold, and the corresponding PAPR value is 2, larger than $\rho$. 
As a result, $f_r^*(r)$ is not the optimal solution of (64). 
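
The value in (72) can be reproduced by a direct calculation. The intermediate normalization step below is an assumption made for this sketch (it is not spelled out in the text): with $b = 0$ the density is linear, $f(r) = a r$ on $[0, \sqrt{\rho P}]$, and $a$ is fixed by requiring unit total probability.

```latex
\begin{align*}
\int_0^{\sqrt{\rho P}} a r \, dr = \frac{a \rho P}{2} = 1
  \quad &\Longrightarrow \quad a = \frac{2}{\rho P}, \\
\int_0^{\sqrt{\rho P}} r^2 \cdot a r \, dr = \frac{a (\rho P)^2}{4}
  &= \frac{\rho}{2} P < P \quad (\rho < 2), \\
\frac{\text{peak power}}{\text{average power}} = \frac{\rho P}{\rho P / 2} &= 2 .
\end{align*}
```

For $\rho = 2$ the same normalization gives $a = 1/P$, which agrees with (71).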

To obtain the optimal solution of (64) when $\rho < 2$, consider the following problem with slightly 
different constraints 

$$\max_{f_r(r)} \left( -\int f_r(r) \log \frac{f_r(r)}{r}\, dr \right) \quad \text{s.t.} \quad E[|r|^2] = P, \qquad |r|^2 \le \rho P. \tag{73}$$

Using similar optimization techniques, the optimal solution of (73), $f_r'(r)$, is found to have the same form 
as (67), (68) with $b < 0$. Therefore, $f_r'(r)$ is not a Rayleigh-like distribution. We will show that $f_r'(r)$ is 
also the optimal solution of (64). Assuming that the distribution $f_r''(r)$ is the optimal solution of (64) and 
$f_r''(r) \neq f_r'(r)$, then $f_r''(r)$ must be the optimal solution of the following optimization problem for some 
$P''$: 

$$\max_{f_r(r)} \left( -\int f_r(r) \log \frac{f_r(r)}{r}\, dr \right) \quad \text{s.t.} \quad E[|r|^2] = P'' < P, \qquad |r|^2 \le \rho P'' < \rho P. \tag{74}$$

However, (73) has a larger maximum value, namely $(-\log a + bP/2)$, than that of (74) because $P > P''$, 
which implies that $f_r'(r)$ maximizes (64). In Fig. 5, we demonstrate the maximum values of $h^*(x)$, where 
$h^*(x) = (-\log a + bP/2 + \log 2\pi)$, for $\rho = 5, 2, 1.1, \infty$. 

Fig. 5. The maximum value $h^*(x)$ for different $\rho$ (PAPR) values. 



Appendix C 
Proof of Theorem 2 

We follow the method in [1], letting the $i$-th element of the input signal be drawn from the random 
code with i.i.d. distribution $f^*(x_i)$: 

$$f_{x_i}(x_i) = \frac{1}{2\pi} a_i \exp\left( -\frac{b_i |x_i|^2}{2} \right), \quad \forall x_i \in B_i, \tag{75}$$

where $x_i \in \mathbb{C}$ and $B_i = \{ x_i : |x_i| \le \sqrt{\rho_i P_i} \}$. At data rate $R = r \log \mathrm{SNR}$, the error probability is 

$$P_e(\mathrm{SNR}) \le P_{out}(R) + P(\text{error}, \text{no outage}).$$

The second term can be upper bounded via a union bound. Assume that $X(0)$, $X(1)$ are two possible 
transmitted codewords, and $\Delta X = X(1) - X(0)$. Suppose that $X(0)$ is transmitted. The probability that 
a maximum likelihood receiver will make a detection error in favor of $X(1)$, conditioned on a certain 
realization of the channel, is 

$$P(X(0) \to X(1) \mid \mathbf{H} = H) = P\left( \frac{1}{2} \| H \Delta X \| \le w \right) \tag{76}$$
$$\le \exp\left( -\frac{1}{4} \| H \Delta X \|^2 \right), \tag{77}$$



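where $w$ is the additive noise in the direction of $H \Delta X$. Then we need to average over the ensemble of 
random codes. Let $x_i$ and $x_i'$ be two i.i.d. random variables with distribution in the form of (75), and let 
$\Delta x_i = x_i' - x_i$. The probability density function of $\Delta x_i$ is 

$$f_{\Delta x_i}(\Delta x_i) = \frac{1}{2\pi} a_i \exp\left( -\frac{b_i |\Delta x_i|^2}{4} \right) \int_{x_i' \in C_i} \frac{1}{2\pi} a_i \exp\left( -b_i \left| x_i' - \frac{\Delta x_i}{2} \right|^2 \right) dx_i', \tag{78}$$

where the integration region $C_i$ consists of the points for which both $x_i \in B_i$ and $x_i' \in B_i$. We discuss different values of $b_i$. 

For $b_i > 0$, 

$$\int_{x_i' \in C_i} \frac{1}{2\pi} a_i \exp\left( -b_i \left| x_i' - \frac{\Delta x_i}{2} \right|^2 \right) dx_i' \le \int_{x_i' \in \mathbb{C}} \frac{1}{2\pi} a_i \exp\left( -b_i \left| x_i' - \frac{\Delta x_i}{2} \right|^2 \right) dx_i' = t_1, \tag{79}$$

where $t_1$ is a constant, which is independent of $P_i$. 

For $b_i = 0$, since $|x_i' - \Delta x_i / 2| \le 2\sqrt{\rho_i P_i}$ and $a_i \rho_i P_i$ is a constant, 

$$\int_{x_i' \in C_i} \frac{1}{2\pi} a_i \exp\left( -b_i \left| x_i' - \frac{\Delta x_i}{2} \right|^2 \right) dx_i' \le \int_{x_i' \in \mathbb{C}} \frac{1}{2\pi} a_i \exp\left( -a_i \left| x_i' - \frac{\Delta x_i}{2} \right|^2 \right) \exp(4 a_i \rho_i P_i)\, dx_i' = t_2, \tag{80}$$

where $t_2$ is a constant, which is independent of $P_i$. 

For $b_i < 0$, since $|x_i' - \Delta x_i / 2| \le 2\sqrt{\rho_i P_i}$ and $b_i \rho_i P_i$ is a constant, 

$$\int_{x_i' \in C_i} \frac{1}{2\pi} a_i \exp\left( -b_i \left| x_i' - \frac{\Delta x_i}{2} \right|^2 \right) dx_i' \le \int_{x_i' \in \mathbb{C}} \frac{1}{2\pi} a_i \exp\left( b_i \left| x_i' - \frac{\Delta x_i}{2} \right|^2 \right) \exp(-8 b_i \rho_i P_i)\, dx_i' = t_3, \tag{81}$$

where $t_3$ is a constant, which is independent of $P_i$. 

Thus we have 

$$f_{\Delta x_i}(\Delta x_i) \le c_i \cdot \frac{1}{2\pi} a_i \exp\left( -\frac{b_i}{4} |\Delta x_i|^2 \right) \tag{82}$$
$$\le d_i c_i \cdot \frac{1}{2\pi} a_i \exp\left( -\frac{b_i'}{4} |\Delta x_i|^2 \right) \tag{83}$$
$$\le d_i c_i \cdot \frac{1}{2\pi} a_i \exp\left( -\frac{b_{\min}}{4} |\Delta x_i|^2 \right), \tag{84}$$

where $c_i = t_1$, $t_2$, or $t_3$. (83) follows from (82) by using the same techniques as above, and $d_i$ is a constant 
independent of $P_i$. $b_{\min} = \min_i b_i'$, where $b_i' = b_i$ for $b_i > 0$; $b_i' = a_i$ for $b_i = 0$; $b_i' = -b_i$ for $b_i < 0$. The 
average pairwise error probability given the channel realization is 

$$P(X(i) \to X(j), i \neq j \mid \mathbf{H} = H) \le K \left( \prod_{i=1}^{m} a_i \right)^{l} \det\left( b_{\min} I + H H^\dagger \right)^{-l}, \tag{85}$$

where $K$ is a constant which is not important here. At a data rate $R = r \log \mathrm{SNR}$, we have a total of $\mathrm{SNR}^{lr}$ 
codewords. Applying the union bound, we have 

$$P(\text{error} \mid \mathbf{H} = H) \le K\, \mathrm{SNR}^{lr} \left( \prod_{i=1}^{m} a_i \right)^{l} \det\left( b_{\min} I + H H^\dagger \right)^{-l}$$
$$= K\, \mathrm{SNR}^{lr} \left( \prod_{i=1}^{m} \frac{a_i}{b_{\min}} \right)^{l} \det\left( I + \frac{1}{b_{\min}} H H^\dagger \right)^{-l}$$
$$\le K'\, \mathrm{SNR}^{lr} \det\left( I + \mathrm{SNR}\, H H^\dagger \right)^{-l}$$
$$= K'\, \mathrm{SNR}^{lr} \prod_{i=1}^{\min(m,n)} (1 + \mathrm{SNR}\, \lambda_i)^{-l}$$
$$\doteq \mathrm{SNR}^{-l \left[ \sum_{i=1}^{\min(m,n)} (1 - \alpha_i)^{+} - r \right]}, \tag{86}$$

where $\lambda_i$ are the singular values of $H$ and $\lambda_i = \mathrm{SNR}^{-\alpha_i}$. Equation (86) is exactly the same as (19) of [1]. 
Following the remaining steps in [1], it can be shown that for $l \ge (m + n - 1)$, the D-MG tradeoff with 
PAPR constraints is achievable. 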
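
For reference, the tradeoff curve of [1] that this proof shows to remain achievable under PAPR constraints is the piecewise-linear function connecting the points $(k, (m-k)(n-k))$ for $k = 0, 1, \ldots, \min(m, n)$. A minimal Python sketch of that curve follows; the function name is hypothetical and the code only evaluates the curve itself, not the code construction.

```python
def dmg_tradeoff(r, m, n):
    """Optimal diversity gain d*(r): piecewise linear through (k, (m - k)(n - k))."""
    kmax = min(m, n)
    if not 0 <= r <= kmax:
        raise ValueError("multiplexing gain must lie in [0, min(m, n)]")
    k = min(int(r), kmax - 1)                  # index of the segment [k, k + 1] containing r
    d_k = (m - k) * (n - k)
    d_k1 = (m - k - 1) * (n - k - 1)
    return d_k + (r - k) * (d_k1 - d_k)
```

For example, a 2 × 2 system has corner points (0, 4), (1, 1) and (2, 0), so the diversity gain at multiplexing gain 0.5 lies halfway between 4 and 1.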